Durbin–Watson statistic

In statistics, the Durbin–Watson statistic is a test statistic used to detect the presence of first-order autocorrelation (a relationship between values separated from each other by a given time lag, here a lag of one period) in the residuals (prediction errors) from a regression analysis. It is named after James Durbin and Geoffrey Watson. The small-sample distribution of this ratio was derived by John von Neumann (von Neumann, 1941). Durbin and Watson (1950, 1951) applied this statistic to the residuals from least squares regressions, and developed bounds tests for the null hypothesis that the errors are serially uncorrelated against the alternative that they follow a first-order autoregressive process. Later, John Denis Sargan and Alok Bhargava developed several von Neumann–Durbin–Watson type test statistics for the null hypothesis that the errors in a regression model follow a process with a unit root against the alternative hypothesis that the errors follow a stationary first-order autoregression (Sargan and Bhargava, 1983). Note that the distribution of this test statistic does not depend on the estimated regression coefficients or on the variance of the errors.
==Computing and interpreting the Durbin–Watson statistic==

If ''e_t'' is the residual associated with the observation at time ''t'', then the test statistic is
: d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2},

where ''T'' is the number of observations. For a lengthy sample, ''d'' is approximately a linear function of the Pearson correlation between the time series and its lagged values.〔http://statisticalideas.blogspot.com/2014/05/serial-correlation-techniques.html〕 Since ''d'' is approximately equal to 2(1 − ''r''), where ''r'' is the sample lag-1 autocorrelation of the residuals,〔Gujarati (2003) p. 469〕 ''d'' = 2 indicates no autocorrelation. The value of ''d'' always lies between 0 and 4. If the Durbin–Watson statistic is substantially less than 2, there is evidence of positive serial correlation. As a rough rule of thumb, a value below 1.0 may be cause for alarm. Small values of ''d'' indicate that successive error terms are, on average, close in value to one another, i.e., positively correlated. If ''d'' > 2, successive error terms are, on average, very different in value from one another, i.e., negatively correlated. In regressions, this can imply an underestimation of the level of statistical significance.
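As an illustration of the definition, the following minimal Python sketch (assuming NumPy is available; the function names `durbin_watson` and `lag1_autocorrelation` are illustrative and not taken from any library) computes ''d'' from a residual series and compares it with 2(1 − ''r''). The two quantities differ exactly by the end-point term (''e_1''² + ''e_T''²)/Σ''e_t''², which is why the approximation improves as the sample gets longer.

```python
import numpy as np

def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation r, assuming the residuals have mean zero."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

# A strictly alternating series: strong negative serial correlation.
e = [1.0, -1.0, 1.0, -1.0]
d = durbin_watson(e)                 # 3.0
r = lag1_autocorrelation(e)          # -0.75
approx = 2.0 * (1.0 - r)             # 3.5
end_term = (e[0] ** 2 + e[-1] ** 2) / np.sum(np.square(e))  # 0.5
print(d, approx, approx - end_term)  # d equals approx minus the end-point term
```

With only four observations the end-point term is large; for residual series of realistic length it is usually negligible.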
To test for positive autocorrelation at significance level ''α'', the test statistic ''d'' is compared to lower and upper critical values (''d_L,α'' and ''d_U,α''):
*If ''d'' < ''d_L,α'', there is statistical evidence that the error terms are positively autocorrelated.
*If ''d'' > ''d_U,α'', there is no statistical evidence that the error terms are positively autocorrelated.
*If ''d_L,α'' < ''d'' < ''d_U,α'', the test is inconclusive.
Positive serial correlation is serial correlation in which a positive error for one observation increases the chances of a positive error for another observation.
To test for negative autocorrelation at significance level ''α'', the test statistic (4 − ''d'') is compared to lower and upper critical values (''d_L,α'' and ''d_U,α''):
*If (4 − ''d'') < ''d_L,α'', there is statistical evidence that the error terms are negatively autocorrelated.
*If (4 − ''d'') > ''d_U,α'', there is no statistical evidence that the error terms are negatively autocorrelated.
*If ''d_L,α'' < (4 − ''d'') < ''d_U,α'', the test is inconclusive.
Negative serial correlation implies that a positive error for one observation increases the chance of a negative error for another observation and a negative error for one observation increases the chances of a positive error for another.
The critical values, ''d_L,α'' and ''d_U,α'', vary by level of significance (''α''), the number of observations, and the number of predictors in the regression equation. Their derivation is complex; statisticians typically obtain them from the appendices of statistical texts.
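The decision rules for the bounds test can be sketched in a few lines of Python. This is only an illustration: the function name `bounds_test` and its return strings are hypothetical, and the critical values must be supplied by the caller from published tables for the relevant ''α'', sample size and number of predictors. The numbers used in the usage lines are arbitrary placeholders, not values from any table.

```python
def bounds_test(d, d_lower, d_upper, alternative="positive"):
    """Apply the Durbin-Watson bounds rule for one-sided alternatives.

    d_lower and d_upper are the tabulated critical values d_L and d_U.
    For the negative alternative the rule is applied to 4 - d.
    """
    if not 0.0 <= d <= 4.0:
        raise ValueError("d must lie between 0 and 4")
    if alternative == "positive":
        stat = d
    elif alternative == "negative":
        stat = 4.0 - d
    else:
        raise ValueError("alternative must be 'positive' or 'negative'")

    if stat < d_lower:
        return "reject"          # evidence of autocorrelation of the stated sign
    if stat > d_upper:
        return "do not reject"   # no evidence of autocorrelation of that sign
    return "inconclusive"

# Placeholder critical values, for illustration only (not from a table).
D_L, D_U = 1.2, 1.6
print(bounds_test(0.9, D_L, D_U))                          # reject
print(bounds_test(1.4, D_L, D_U))                          # inconclusive
print(bounds_test(3.2, D_L, D_U, alternative="negative"))  # reject
```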
If the design matrix '''X''' of the regression is known, exact critical values for the distribution of ''d'' under the null hypothesis of no serial correlation can be calculated. Under the null hypothesis ''d'' is distributed as
: \frac{\sum_{i=1}^{n-k} \nu_i \xi_i^2}{\sum_{i=1}^{n-k} \xi_i^2},
where ''n'' is the number of observations and ''k'' the number of regression variables; the ''ξ_i'' are independent standard normal random variables; and the ''ν_i'' are the nonzero eigenvalues of
: ( \mathbf{I} - \mathbf{X} ( \mathbf{X}^T \mathbf{X} )^{-1} \mathbf{X}^T ) \mathbf{A},
where '''A''' is the matrix that transforms the residual vector '''e''' into the numerator of the ''d'' statistic, i.e.
: d = \frac{\mathbf{e}^T \mathbf{A} \mathbf{e}}{\mathbf{e}^T \mathbf{e}}.
A number of computational algorithms for finding percentiles of this distribution are available.
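One simple way to approximate percentiles of this null distribution is Monte Carlo simulation of the ratio of weighted chi-squared variables. The sketch below is a rough illustration, not one of the specialised algorithms referred to above. It assumes NumPy, a design matrix of full column rank, and hypothetical helper names (`dw_matrix`, `null_eigenvalues`, `simulate_null_quantile`). It takes '''A''' to be the tridiagonal matrix for which '''e'''ᵀ'''Ae''' equals the sum of squared successive differences, and it uses the symmetric product '''MAM''' (with '''M''' the residual-maker matrix), which shares its nonzero eigenvalues with '''MA'''.

```python
import numpy as np

def dw_matrix(n):
    """Tridiagonal A with e' A e = sum_{t=2}^{n} (e_t - e_{t-1})^2."""
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    A[0, 0] = 1.0
    A[-1, -1] = 1.0
    return A

def null_eigenvalues(X):
    """The n - k eigenvalues nu_i for a full-column-rank design matrix X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    Q, _ = np.linalg.qr(X)               # orthonormal basis of the column space
    M = np.eye(n) - Q @ Q.T              # residual-maker matrix I - X(X'X)^{-1}X'
    eig = np.linalg.eigvalsh(M @ dw_matrix(n) @ M)  # ascending order
    return eig[k:]                       # drop the k zeros from the null space of M

def simulate_null_quantile(X, alpha=0.05, draws=20000, seed=0):
    """Monte Carlo estimate of the alpha-quantile of d under the null."""
    nu = null_eigenvalues(X)
    rng = np.random.default_rng(seed)
    xi2 = rng.standard_normal((draws, nu.size)) ** 2
    d = (xi2 @ nu) / xi2.sum(axis=1)
    return float(np.quantile(d, alpha))

# Usage: intercept plus linear trend, 20 observations (an arbitrary example).
n = 20
X = np.column_stack([np.ones(n), np.arange(n)])
print(simulate_null_quantile(X, alpha=0.05))
```

The simulated quantile plays the role of an exact lower-tail critical value for this particular design matrix, so it removes the inconclusive region of the bounds test at the cost of Monte Carlo error.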
Although serial correlation does not affect the consistency of the estimated regression coefficients, it does affect the validity of statistical tests. First, the F-statistic for overall significance of the regression may be inflated under positive serial correlation because the mean squared error (MSE) will tend to underestimate the population error variance. Second, positive serial correlation typically causes the ordinary least squares (OLS) standard errors for the regression coefficients to underestimate the true standard errors. As a consequence, if positive serial correlation is present, standard linear regression analysis will typically produce artificially small standard errors for the regression coefficients. These small standard errors inflate the estimated t-statistics, suggesting significance where perhaps there is none. The inflated t-statistics may, in turn, lead to rejecting null hypotheses about population values of the regression parameters more often than would happen if the standard errors were correctly estimated.
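The understatement of standard errors can be seen in a small simulation. The following sketch is an illustrative experiment under assumed settings (NumPy, a single regressor, and both the regressor and the errors following first-order autoregressions with the same coefficient); the function names and parameter values are hypothetical choices made for the example. It compares the average conventional OLS standard error of the slope with the actual spread of the slope estimates across replications.

```python
import numpy as np

def ar1(n, rho, rng):
    """Simulate a stationary AR(1) series with standard normal innovations."""
    x = np.empty(n)
    x[0] = rng.standard_normal() / np.sqrt(1.0 - rho ** 2)
    innov = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + innov[t]
    return x

def compare_standard_errors(n=100, rho=0.8, reps=500, seed=0):
    """Return (mean reported OLS s.e. of the slope, empirical s.d. of the slope)."""
    rng = np.random.default_rng(seed)
    slopes = np.empty(reps)
    reported = np.empty(reps)
    for r in range(reps):
        x = ar1(n, rho, rng)
        u = ar1(n, rho, rng)                  # serially correlated errors
        y = 1.0 + 2.0 * x + u
        X = np.column_stack([np.ones(n), x])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (n - 2)          # conventional error-variance estimate
        cov = s2 * np.linalg.inv(X.T @ X)     # conventional OLS covariance matrix
        slopes[r] = beta[1]
        reported[r] = np.sqrt(cov[1, 1])
    return reported.mean(), slopes.std(ddof=1)

mean_se, true_sd = compare_standard_errors()
print(mean_se, true_sd)   # the reported s.e. is expected to be the smaller of the two
```

Setting `rho=0.0` in the same function gives serially uncorrelated errors, in which case the two numbers should roughly agree.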
If the Durbin–Watson statistic indicates the presence of serial correlation of the residuals, this can be remedied by using the Cochrane–Orcutt procedure.
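A minimal sketch of one common formulation of the iterative Cochrane–Orcutt procedure is given below, assuming NumPy, first-order autoregressive errors, and a design matrix `X` that already contains a column of ones for the intercept. The function name, iteration limit and tolerance are illustrative choices, and details such as the treatment of the first observation vary between implementations.

```python
import numpy as np

def cochrane_orcutt(y, X, max_iter=50, tol=1e-8):
    """Iteratively estimate regression coefficients and the AR(1) parameter rho."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # start from plain OLS
    rho = 0.0
    for _ in range(max_iter):
        e = y - X @ beta                             # residuals of the original model
        rho_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])  # regress e_t on e_{t-1}
        # Quasi-difference the data; the first observation is dropped.
        y_star = y[1:] - rho_new * y[:-1]
        X_star = X[1:] - rho_new * X[:-1]
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho
```

Because the column of ones is transformed to (1 − ''ρ'') along with the other regressors, the returned intercept is on the scale of the original model. The Durbin–Watson statistic can then be recomputed on the residuals of the transformed regression to check whether the remaining serial correlation is acceptable.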
The Durbin–Watson statistic, while reported by many regression analysis programs, is not applicable in certain situations. For instance, when lagged dependent variables are included among the explanatory variables, the test is inappropriate. Durbin's h-test (see below) or likelihood ratio tests, which are valid in large samples, should be used instead.
==Durbin h-statistic==
The Durbin–Watson statistic is biased for autoregressive moving average models, so that autocorrelation is underestimated. For large samples, however, one can easily compute the h-statistic, which is asymptotically standard normal under the null hypothesis of no serial correlation:
: h = \left( 1 - \frac{1}{2} d \right) \sqrt{\frac{T}{1 - T \cdot \widehat{\operatorname{Var}}(\widehat\beta_1)}},
using the Durbin–Watson statistic ''d'' and the estimated variance
: \widehat{\operatorname{Var}}(\widehat\beta_1)
of the regression coefficient of the lagged dependent variable, provided
: T \cdot \widehat{\operatorname{Var}}(\widehat\beta_1) < 1.
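The h-statistic is straightforward to evaluate once ''d'', ''T'' and the estimated variance of the lagged-dependent-variable coefficient are available. The following Python sketch uses only the standard library; the function names and the input numbers in the usage lines are illustrative assumptions, not results from any data set. The p-value helper relies on the large-sample standard normal approximation.

```python
import math

def durbin_h(d, n_obs, var_beta_lag):
    """Durbin's h from the Durbin-Watson statistic d, sample size T and
    the estimated variance of the lagged dependent variable's coefficient."""
    denom = 1.0 - n_obs * var_beta_lag
    if denom <= 0.0:
        raise ValueError("h is undefined unless T * Var(beta_1) < 1")
    return (1.0 - 0.5 * d) * math.sqrt(n_obs / denom)

def two_sided_p_value(h):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(h) / math.sqrt(2.0))

# Illustrative inputs only.
h = durbin_h(d=1.6, n_obs=100, var_beta_lag=0.005)
print(h, two_sided_p_value(h))
```

When ''T'' times the estimated variance is 1 or more, the square root is not defined and the function raises an error rather than returning a value.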

Excerpt source: Wikipedia, the free encyclopedia